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A MULTI-MESH FINITE ELEMENT METHOD FOR LAGRANGE 
ELEMENTS OF ARBITRARY DEGREE 

AXEL VOIGT* AND THOMAS WITKOWSKit 

Abstract. We consider within a finite element approach the usage of different adaptively refined 
meshes for different variables in systems of nonlinear, time-depended PDEs. To resolve different so- 
lution behaviours of these variables, the meshes can be independently adapted. The resulting linear 
systems are usually much smaller, when compared to the usage of a single mesh, and the overall 
computational runtime can be more than halved in such cases. Our multi-mesh method works for 
Lagrange finite elements of arbitrary degree and is independent of the spatial dimension. The ap- 
proach is well defined, and can be implemented in existing adaptive finite element codes with minimal 
vN effort. We show computational examples in 2D and 3D ranging from dendritic growth to solid-solid 

j^,^ phase-transitions. A further application comes from fiuid dynamics where we demonstrate the ap- 

j^ plicability of the approach for solving the incompressible Navier-Stokes equations with Lagrange 

^L^ finite elements of the same order for velocity and pressure. The approach thus provides an easy to 

^^^ implement alternative to stabilized finite element schemes, if Lagrange finite elements of the same 

order are required. 

1. Introduction. Nowadays, adaptive /i-metliods for mesti refinement are a 

i~~i standard teclrnique in finite element codes. They are used to resolve a mesh due 

"^ to the local behaviour of the solution. When solving PDEs with multiple variables, 

[^ e.g. velocity and pressure in the Navier-Stokes equations, the mesh has to be adapted 

to the behaviour of all components of the solution. If these behaviours are different, 

the use of a single mesh may lead to an inefficient numerical method. In this work we 

propose a multi-mesh finite element method that makes it possible to resolve the local 

S nature of different components independently of each other. This method works for 

Lagrange elements of arbitrary degree in any dimension. Furthermore, the method 

^ works "on the top" of standard adaptive finite element methods. Hence, only small 

^ changes are required if our method has to be implemented in existing finite element 

OO codes. We have implemented the multi-mesh method in our finite element software 

AMDiS (adaptive multidimensional simulations), see pL,, for Lagrange finite elements 

up to fourth degree for ID, 2D and 3D. 

To our best knowledge, Schmidt [2] was the first who has considered the use of 
multiple meshes in this context. Li et al. have introduced a very similar technique 
and used it to simulate dendritic growth [31 131 [S] . Although introducing a multi-mesh 
technique, in none of these publications the method is formally derived. Furthermore 
;V implementation issues are not discussed and detailed runtime results, which compare 

. ^H the overall runtime between the single-mesh and the multi-mcsh method are miss- 

^C ing. In contrast, in this work we will formally show how multiple meshes are used 

in the context of assembling matrices and vectors in the assembly step of the finite 
element methods and will discuss issues related to error estimates for each compo- 
nent. Furthermore, we will compare the runtimes of both methods and show that 
the multi-mesh method is superior to the single-mesh method, when one component 
of the PDF can locally be resolved on a coarser mesh. Solin et al. [6l [7] have also 
introduced a multi-mesh method, but for hp-FEM. Their method is based on trans- 
forming quadrature points which is harder to implement in existing finite element 
codes. Furthermore, also in these works detailed runtime studies are missing. We 
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Fig. 2.1. (a) A two-dimensional macro mesh, (b) Some refinements of it. (c) Binary tree of 
macro element 



should further mention other approaches which are commonly used to deal with dif- 
ferent meshes for different components of coupled systems. Especially in the case of 
multi-physics applications a need exists to couple independent simulations code. A 
standard tool which can be used to couple various finite element codes is MpCCI 
(mesh-based parallel code coupling interface) [S]. In this approach an interpolation 
between the different solutions from one mesh to the other is performed which for 
different resolutions of the involved meshes will lead to a loss in information and is 
thus not the method of choice for the problems to be discussed in this work. 

The paper is structured as follows: In the next section we give a brief overview on 
adaptive meshes, especially in the context of AMDiS, and introduce the terminology 
used throughout this paper. Section 3 introduces the so called virtual mesh assem- 
bling, which is the basis of our multi-mesh method. It is shown, how the coupling 
meshes are build in a virtual way and how the corresponding coupling operators are 
assembled on them. In Section 4, we present several numerical experiments in 2D and 
3D that show the advantages of the multi-mesh method. The last section summarizes 
our results. 

2. Adaptive meshes. The usage of an adaptive mesh, together with error esti- 
mators and a refinement strategy, is a standard finite element technique to compute 
solution of PDEs with a given accuracy with the lowest possible computational effort. 
For a general overview on this topic see for example [^, and references therein. In 
this section, we describe how adaptive meshes and the associated algorithms are im- 
plemented in our finite element software AMDiS. The data structures that are used to 
store and manipulate adaptive meshes, are the basis for a fast and efficient multi-mesh 
method, as it is presented in the next section. 

A mesh in AMDiS consist of simplicial elements, which are lines in ID, triangles 
in 2D and tetrahedral in 3D. If an element has to be refined within the adaption 
loop, it will be bisected into two elements of the same dimension. The refinement 
algorithm, that is implemented in AMDiS, is described in [IQJ in more detail. The 
two new elements are called children of the parent element. The coarsest elements 
are called macro elements. Accordingly, the macro mesh is the union of all macro 
elements. The refinement history of a macro element is represented by a binary tree. 
If a node in this tree is a leaf, the corresponding element is not refined. Otherwise, the 
children of the node represent the children of the element. To avoid hanging nodes 
(vertices of an element which are not vertices of the neighbour element), it may be 
necessary to first refine a neighbour of an element before refining the element itself. 
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In this way, a single refinement can cause some propagation refinements of elements in 
the neighbourhood. In Figure [2?H a triangular macro mesh consisting of four elements 
(a), some refinements of this macro triangulation (b) and the corresponding binary 
tree for macro element (c) are shown. 

All topological and geometrical information are only stored for the macro ele- 
ments. To get these information for other elements in the mesh, an algorithm called 
mesh traverse is used. This algorithm traverses all binary trees, corresponding to the 
macro elements, and recursively computes the requested information, e.g. coordinates 
of vertices, for the requested child elements from the information of its parent element. 
The mesh traverse algorithm than calls a function that computes on the element data. 
An algorithm, that makes use of the mesh traverse, can specify the element level on 
which it wants to process. The level of an element is defined to be the depth of its 
node representation in the binary tree. In most cases, the leaf level, i.e., the union of 
all leaves in all binary trees, is used for mesh traverse. 

The additional effort to compute the element data for the leaf mesh multiple 
times is usually less than 1% of the computational time of the function that process 
on the mesh. Instead, a large amount of memory can be saved. In AMDiS, the 
information data for one element requires around 200 bytes per element. Many of 
our simulations make use of around one million elements per processor. Storing all 
element information explicitly for these simulations would require more than 200 
Mbyte of extra memory. 

2.1. Error estimation and adaptive strategies. AMDiS provides different 
methods for error estimation and mesh refinement. Furthermore, due to an abstract 
interface, is is easily possible to implement other methods, that are more adapted to 
a specific PDF. In the following, we describe the most common ones. 

In AMDiS, the standard estimator for the spatial error is the residual error esti- 
mator as for example described by Verfiirth [TT] for general non-linear elliptic PDEs. 
This local element wise estimator defines for each element T of a mesh an indicator, 
depending on the finite element solution Uh, by 

Vriuh) = CoRriuh) + Ci ^ JeM, (2.1) 

ECT 

where Rt is the element residual on element T and Je is the jump residual, defined 
on an edges E. Cq and C'l are user defined constants that can be used to adjust the 
estimator. The global error estimation of a finite element solution is than the sum of 
the local estimates: 



i/p 






Viuh) - 2^ VtM^ , P > 1, (2.2) 

XteT ) 

where T is a partitions of the domain into simplices. The user has to specify an error 
tolerance tol. If r](uh) > tol, the mesh must be refined in some way in order to reduce 
the error. Several strategies are implemented in AMDiS: 

• the maximum strategy, as described by Verfiirth [T^ . 

• the equidistribution strategy, as described by Eriksson and Johnson [13] . 

• the guaranteed error reduction strategy, as described by Dorfler |14J. 

In all of the numerical experiments in Section |4l we make use of the equidistribution 
strategy, which we found out to be easy to use and which provided good results. The 
basis for this strategy is the idea that the error is assumed to be equidistributed over 
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all elements. Hence, if the partition consists of n elements, the element error estimates 
should fulfill 

VriUh) ~ —^ = VeqiUh)- (2.3) 

The equidistribution strategy now introduces two parameters. Or G (0, 1) and 9c € 
(0,1). An element is refined, if riT{uh) > dRTjeqiuh), and the element is coarsen, if 
VTiuh) < 0cVeq{uh)- The parameter 0^ is usually chosen to be close to 1, and 0c' to 
be close to 0, respectively. 

If the equation consists of more than one variable, one can define multiple error 
estimators and mesh adaption strategies. This is also the case, if only one mesh is used 
to resolve all variables. The error estimators work independently of each other, and 
the user can provide the constants Cg and Ci for each component. Correspondingly, 
independent mesh adaption strategies may be defined for each component. In the case 
of the single-mesh method, an element is refined, if at least one strategy has marked 
the element to be refined. An element is coarsen, if all strategies have marked it to be 
coarsen. In the multi-mesh method, these conditions are omitted, because the meshes 
are adapted independently of each other. 

3. Virtual mesh assembling. The basis of our multi-mesh method is the so 
called virtual mesh assembling. Systems of PDEs usually involve coupling terms. If 
each component of the system is assembled on a different mesh, special care has to 
be devoted to these coupling terms. In the next section, we shortly describe this 
situation. The section |3.2| than introduces the dual m,esh traverse. This algorithm 
create a virtual union of two meshes without creating it explicitly. To the last, we 
show how to compute integrals, that appear within the assemble procedure, on these 
virtual meshes. 

3.1. Coupling terms in systems of PDEs. To illustrate the techniques pre- 
sented in this section, we consider the homogeneous biharmonic equation as a simple 
example for general systems of PDEs. This equation reads: 

A^u = in r^ and u = — = on dQ, (3.1) 

on 

with u e C^(51) n C^{Cl). Using operator splitting, the biharmonic equation can be 
rewritten as a system of two second order elliptic PDEs: 

-Au + v = , ^ 

Aw = ^ ' 

The standard mixed variational formulation of this system is: Find {u,v) G HQ^fl) x 
H^{Q) such that 

/ VwV^da; + / vcpdx = V(/> G H\n) 

Jn Jn 



Jn 



(3.3) 



To discretize these equations, we assume that 7^" and Tfj; are different partitions 
of the domain n into simplices. Than, V^" = {vh e H^ : w/i|t G -P" VT G T^} 
and Vfj^ — {vh G Hq : Vhlr G -P" VT G Til} are finite element spaces of globally 
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continuous, piecewise polynomial functions of an arbitrary but fixed degree. We thus 
obtain: Find {uh,Vh) G V,° x Vf^ such that 

'^UhS/(J3dx + / Vh(t)dx = V0 e Vl^{n) 

n Jn ^34^ 

/ Vvh'^i^dx = V^' e V^^(f^). 

Jn 

Let define {(/)i \ 1 < i < n} and {i/ii | 1 < i < m} to be the nodal basis of V^ 
and Vf^ , respectively. Hence, Uh and Vh can be written by the linear combinations 
u/i — X]i=i ^i'/'j ^'^'i ''^h — X]i=i ^iV'ij with Ui and vt the unknown real coefficients. 
Using these relations and braking up the domain in the partitions of Q, Equation 



(3.4) rewrites to 



I]"J 










i ~ 1 . . .m. 



(3.5) 

To compute the coupling term, we have to define the union of two different partitions 
Tj^ U T^. For this, we make a restriction on the partitions: Any element T'^ £ Tj^ 
is either a subelement of an element T^ £ T^, or vice versa. This restriction is not 
very strict. It is always fulfilled for the standard bisectioning algorithm, if the initial 
meshes for all components share the same macro mesh. Then, T^ U Tj^ is the union 
of the locally finest simplices. 



The most common way to compute the integrals in Equation (3.5) is to define 
local basis functions. We define (j)i,j to be the j'-th local basis function on an element 
Ti e T^ . Tpij is defined in the same way for elements in the partition 7^^. 

Because the global basis functions <j)i and tpj are defined on different triangulations 
of the same domain, it is not straightforward to calculate the coupling term J^ il)j4)i 
in an efficient manner. For evaluating this integral, two different cases may occur: 
either the integral has to be evaluated on an element from the partition T^ or on 
an element from 7^^. For what follows, we fix the first case. In terms of local basis 
functions we have to evaluate 

/ i'k,i4>i,j (3.6) 

JT,er° 

for some j and I, and there exists an element Tk G 7^^, with Ti G T^. Our aim is to 
develop a multi-mesh method that works on the top of existing finite element software. 
These have usually implemented special methods to evaluate local basis functions, or 
the multiplication of two basis functions, respectively, on elements very fast. This 
involves for example precalculated integral tables or fast quadrature rules. All these 
methods cannot directly be applied to the coupling terms, because ipk,i in Equation 



(3.6) is not a local basis function of the element Ti. The general idea to overcome 
this problem is to define the basis functions i/jj, ; by a linear combination of local basis 
functions of T. Thus, 



X 



'ipk,i(i>t,j = / E('^'='™'^*'™)'^*j' ^^■'^^ 
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Fig. 3.1. Dual mesh traverse on two independent refined meshes on the same macro mesh. 



with some real coefRcients Ck,m- For the other case, i.e., the integral in the coupling 
term is evaluated on an element Ti G T^, we have 



V'fc,/0J 



T.GTi 



/ V'fc.iV'(Cj,mV'fc.i 



(3.8) 



Summarized, to make evaluate of the coupling terms possible, two different tech- 
niques have to be defined and implemented: Firstly, the method requires to build a 
union of two meshes. This leads to an algorithm which we name dual mesh traverse. 
It will be discussed in the next section. Once the union is obtained, we need to cal- 
culate the coefficients Qj and to incorporate them in the finite element assemblage 
procedure such that the overall change of the standard method is as small as possible. 

3.2. Creation of the virtual mesh. The simplest way to obtain the union of 
two meshes is to employ the data structure they are stored in. Hence, in our case we 
could explicitly build the union by joining the binary trees of both meshes into a set 
of new binary trees. Especially when we consider meshes that change in time, this 
procedure is not only too time-consuming but requires also additional memory to store 
the joined mesh. To avoid this, we exploit the case that in AMDiS functions, that 
process element data, never directly work on the mesh data but instead use the mesh 
traverse algorithm, that creates the requested element data on demand. According to 
this method, we define the dual mesh traverse that traverses two meshes in parallel 
and thus creates the union of both meshes in a virtual way. 

For the dual mesh traverse the only requirement is that both meshes must share 
the same macro mesh, but they can be refined independently of each other. Due to this 
requirement and because of the bisectioning refinement algorithm the following holds: 
If the intersection of two elements of two different meshes is non empty, then either 
both elements are equal or one element is a real subelement of the other. To receive 
the leaf level of the virtual mesh, the dual mesh traverse simultaneously traverses two 
binary trees, each corresponding the the same macro element in both macro meshes. 
The algorithm than calls a user defined function, e.g. the element assembling function 
or an element wise error estimator, that works on pairs of elements, with both, the 
larger and the smaller element of the current traverse. The larger of both elements is 



fixed as long as all smaller subelements in the other mesh are traversed. Figure 3.1 
shows a simple example for a macro mesh consisting of four macro elements. In the 
first mesh macro element 0, and in the second mesh macro element 1 are refined once. 



3.3. AssembUng of element matrices. 



fast as possible, the form as given by the Equations (3.7) and (3 



To make the overall calculation as 
is not appropriate. 



To implement these transformations, some changes of the inner assemblage procedure 

6 



would be required. In the following, we show in which way the transform matrices 
can be incorporated in the standard assemble procedure, such that the changes will 
be as small as possible. To the first, we have to distinguish two cases: the smaller of 
both elements defines either the space of test functions, or it defines the space of trial 
functions. For the first case, assume a general zero order term of the form J^ ipiC(j>j, 
with c G L°°{il) and some local basis functions 4'i and (pj, has to be assembled on a 
virtual mesh. Then, for some elements T and T', with T' C T, the element matrix 
Mt' is given by 



Jt' V'oC'^o • ■ • Jt' ^oc4>n 



Mt' 



jT'E»(Cm(/'»)c0O ••• /y, Ei(c™'/'Oa/'ny 
\l2i (^ni frp, 4liC4>a ... X^i C„i /p, 4liC4>n/ 

where (pi are the local basis function defined on T' , ^i are the local basis function 
defined on T, and C is the transformation matrix for the local basis function from T 
to T' . This shows, that to assemble the element matrix of a virtual element, there 
is no need for larges changes within the assemble procedure. The finite element code 
needs only to assemble the element matrix of the smaller element T' and multiply the 
result with the transformation matrix. Hence, if the transformation matrices can be 
computed easily, the overhead for virtual element assembling can be neglected. 

The second case, where the smaller element defines the space of trial functions, 
is very similar. The same calculation as above shows that the following holds: 

: ; -C^ (3-10) 

/y, cjinCCpo ... /j,, (j)nC(t)n) 

In a similar way we can also reformulate the element matrices for general first 
and second order terms. For a general second order term of the form j^ Vipi ■ AVcpj, 
with A : fi M> M''^'', the element matrix Mt' can be rewritten in the same way as we 



have done it in (3.9 1 for general zero order terms: 

//^, VVo • AV(/)o ... J^,ViJo-AV(t>n' 

Mr' = : : 

V/^, VV„ • AV./.0 ... J^, VV'n • AV<?!>„ 

//^, V0O • AV^o • • • Jr' V</>o • AV(/)„' 
= Cv • : ; 

V/^, V0„ • AV0O • • • Jr' '^4>n ■ AV(/.,, 



(3.11) 



where the coefficients of the matrix Cv are defined such that 

n 

If the smaller element defines the space of trial functions, we can establish the same 



relation as in (3.10). For general first order terms of the form J^ipih ■ V0j, with 
b £ [L°°{il)]'^, it is simple to check that for the case the smaller element defines 
the test space, the element matrix Mt' can be calculated on the smaller element 
and multiplied with C from the left. If the smaller element defines the space of trial 
function, the element matrix calculated on the smaller element must be multiplied 
with Cy from the right. 

3.4. Calculation of transformation matrices. We have shown that if the 
transformation matrix is calculated for a given tuple of small and large element, the 
additional cost for virtual mesh assembling is very small. In this section, we show 
how to compute these transformation matrices efficiently. We formally define a virtual 
element pair by the tuple 

(T, {ao, . . . , an}) = (T, a) with a^ £ {L, R}, (3.13) 

where T is the larger element of the pair and a is an ordered set that is interpreted 
as the refinement sequence for element T. Thus, L denotes the "left" , and R denotes 
the "right" children of and element. Furthermore, we define a function TRA that 
uniquely maps a virtual element pair to the smaller element. It is defined recursively 
by: 

TRA{T, ^)^T 

rruA(rri ^ w {TRA{Ti,,{ai,...,an}) ifao^L 

\TRA{T^,{ai,...,an\) if ao = R, 

where Tl is the left child of the element T, and Tr the right child of the element, 
respectively. In the same way we can now define transformation matrices as functions 
on refinement sequences: 



C({ao,ai, 




,a„}) ifQ;o = L 
. ,a„}) ifao = R, 



where Cl and Cjj are the transformation matrix for the left child and the right child, 
respectively, of the reference element. 

3.5. Implementation issues. Although the calculation of transformation ma- 
trices is quite fast, it can considerably increase the time for assembling the linear 
system. This is especially the case, if one mesh is much coarsen in some regions than 
the other mesh. To circumvent this problem, we have implemented a cache, that 
stores the transformation matrices. In the mesh traverse routine, an 64 bit integer 



data type stores the refinement sequence bit-wise, as it is defined by (3.13). If the 
bit on the i-th position is set, the finer element is a right-refinement of its parent 
element, otherwise it is a left-refinement of it. Of coarse this limits the level gap 
between the coarser and the finer level to be less or equal to 64. But we have not 



found any practical simulations, where this value is more than 30. Using this data 
type makes it than easy to define associative array that uniquely maps a refinement 
sequence to a transformation matrices. If a transformation matrix was computed for 
a given refinement sequence for the first time, it will be stored in this array. To access 
previously computed matrices using the integer key is than very cheap. In general 
this data structure should be restricted to a fixed number of matrices to not spend 
to much of memory. In all of our simulations, the number of matrices that should be 
stored in the cache never exceed 100.000. Also in the 3D case with linear elements the 
overall memory usage is than around 2 Mbyte, and can thus be neglected. Therefore, 
we have not yet considered to implement an upper limit for the cache. 

4. Numerical results. In this section, we present several examples, where the 
multi-mesh approach is superior in contrast to the standard single-mesh finite element 
method. Examples to be considered are phase-field equations to study solid-liquid and 
solid-solid phase transitions. For a recent review we refer to |15j . These equations 
involve at least one variable, the phase-field variable, which is almost constant in most 
parts of the domain and thus can be discretized within these parts using a coarse mesh. 
Within the interface region however a high resolution is required to resolve the smooth 
transition between the different phases. A second variable in such systems is typically 
a diffusion field which in most cases varies smoothly across the whole domain and 
thus will require a finer mesh outside of the interface and a coarser mesh within the 
interface. Such problems are therefor well suited for our multi-mesh approach. We 
will consider dendritic growth in solidification and coarsening phenomena in binary 
alloys to demonstrate the applicability. 

Other examples for which large computational savings due to the use of the multi- 
mesh approach are expected are diffuse interface and diffuse domain approximations 
for PDE's to be solved on surfaces are within complicated domains. The approaches 
introduced in [16] and [17], respectively, use a phase-field function to implicitly de- 
scribe the domain the PDE has to be solved on. For the same reason as in phase- 
transition problems the distinct solution behaviour of the different variables will lead 
to large savings if the multi-mesh approach is applied. The approach has already 
been used for applications such as chaotic mixing in microfiuidics [T^, tip splitting 
of droplets with soluble surfactants [19] , and chemotaxis of stem cells in 3D scaffolds 

m- 

As a further example we demonstrate that the multi-mesh approach can also be 
used to easily fulfill the inf-sup condition for saddle-point problems if both components 
are discretized using linear Lagrange elements. We demonstrate this numerically for 
the incompressible Navier-Stokes equation with piecewise linear elements for velocity 
and pressure, but a finer mesh used for the velocity. Such an approach might be 
superior to mixed finite elements of higher order or stabilized schemes in terms of 
computational efficiency and implementational efforts. For a review on efficient finite 
element methods for the incompressible Navier-Stokes equation we refer to [21]. We 
demonstrate the applicability of the multi-mesh approach on the classical driven cavity 
problem. 

4.1. Dendritic growth. We first consider dendritic growth using a phase- field 
model, which today is the method of choice to simulate microstructure evolution 
during solidification. For reviews we refer to |22| . A widely used model for quantitative 
simulations of dendritic structures was introduced by Karma and Rappel |23l 124) . 
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which reads in non-dimensional from 



A^{n)dt^ ^{cj>- \u{l - 4>'')){l - <i?) + V(A2(n)V^) + ^ 9,. {\^ c\>\^ A{ny^^ 



dtu = BV'^u + -at<; 



(4.1) 



where d = 2,3 is the dimension, D is the thermal diffusivity constant, A = — , with 
a2 = 0.6267 is a coupling term between the phase-field variable (\) and the thermal 
field u and A is an anisotropy function. For both, simulation in 2D and 3D, we use 
the following anisotropy function: 

^(n)^(l-3.)(l + ^^k^), (4.2) 

where e controls the strength of the anisotropy and n = ry^r denotes the normal to 
the solid- liquid interface. In this setting the phase-field variable is —1 in liquid and 
1 in solid, and the melting temperature is set to be zero. As boundary condition we 
set M = — A to specify an undercooling. For the phase-field variable we use zero-flux 
boundary conditions. 

To implement these equations in our toolbox AMDiS, we first discretize in time. 
This is here done using a semi-implicit Euler method, which yields a sequence of 
nonlinear elliptic PDEs: 



^"^^n+i + ,/ + .9 - V(yl2(n„)V</-„+i) - £[A(n„)] =. ^^^^, 

T T 
DV M„+i - 



(4.3) 



2 r r 2 T 



with / = (1)1^^ - 0„+i, g = A(l - (j)l_^_^fun+i and 

d 



£[A{n^)] = ^a,, ( |V(/)„+i|2A(n„)-^^^''^ 



dxi4'n 

We now linearize the involved nonlinear terms / and g: 

f « {3^1 - l)(/)„+i - 2(/)3 
9~ A(l - (/)^)2u„+i 



(4.4) 



and obtain a linear system for 4'n+i and w„+i to be solved in each time step. 

To compare our multi-mesh method with a standard adaptive finite element ap- 
proach, we have computed a dendrite using linear finite elements. The following 
parameters are used: 

A==0.65,i:» = 1.0,6 = 0.05 

We have run the simulation with a constant timestep r — 1.0 up to time 7500. To 
speedup the computation we have employed the symmetry of the solution and limited 
the computation to the upper right quadrant with a domain size of 800 into each 
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direction. The adaptive mesh refinement rehes on the residuum based a posteriori 
error estimate, as defined by |2.1[ By setting Cq to and Ci to 1, we restrict the 
estimator to the jump residuum only. We have set the tolerance to be tol^ = 0.5 and 
tolu = 0.25. For adaptivity, the equidistribution strategy with parameters Or = 0.8 
and 9c' — 0.2 was used. Thus, the interface is resolved by around 20 points. 
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Fig. 4.1. Evolution of degrees of freedom in time for the phase field and the temperature field 
using single-mesh and multi-mesh method. 

The result of both computations coincides at the final timestep. As a quantitative 
comparison we use the tip velocity of the dendrite. As reported by Karma and Rappel 
[2^ . for this parameter set analytical calculations lead to a steady-state tip velocity 
Vtip = 0.0469. In both of our calculations, the tip velocity varies around 1% to this 
value. Using the single-mesh method, within the final timestep the mesh consist of 
278.252 degrees of freedom (DOFs). When our multi-mesh method is used, the same 
solution can be obtained with a mesh for the phase-field with 256.342 DOFs and 
27.354 DOFs for the temperature field. The gap between these numbers increase 



over time, see Figure 4.1 showing the development of DOFs over time. Figure 4.2 
qualitatively compares the meshes of the phase-field variable and the temperature 
field which shows the expected finer resolution of the phase-field mesh within the 
interface and its coarser resolution within the solid and liquid region. 

The computational time for both methods is compared in Table [4?T| The assem- 
blage procedure in the multi-mesh methods is around 6% faster, although computing 
the element matrices is slower due to the extra matrix-matrix multiplication. This is 
easily explained by the fact that we have much less element matrices to compute and 
the overall matrix data structure is around 50% smaller with respect to the number of 
non-zero entries. This is also reflected in the solution time for the linear system. We 
have run all computation twice, with using either UMFPACK, a multifrontal sparse 
direct solver [25], or the BiCGStab(£) with diagonal preconditioning, that is part of 
the Math Template Library (MTL4), see [2^. When using the first one within the 
multi-mesh method, the solution time can be reduced by 40% and also the memory 
usage, which is often the most critical limitation in the usage of direct solvers, is 
reduced in this magnitude. An even more drastic reduction of the computation time 

11 



can be achieved when using an iterative solver. Here, the number of iterations is 
around 30% less with the multi-mesh method and each iteration is faster due to the 
smaller matrix. The time for error estimation is halved, as expected, since it scales 
linearly with the number of elements in the mesh. Altogether, the time reduction is 
significant in all parts of the finite element method for this example. In addition the 
approach also leads to a drastic reduction in the memory usage. 
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Fig. 4.2. a) 2D dendrite computed att = 7500 using the multi-mesh method with the parameters 
A = 0.65, D = 1.0, £ = 0.05 and a timestep r = 1.0. Left shows the mesh of the temperature field and 
right shows the mesh of the phase field, b) Zoom into the upper tip showing the different resolution 
of both meshes. 





single-mesh 


mult i- mesh 


speedup 


assembler 


5.98s 


5.62s 


6.0% 


solver: UMFPACK 


6.69s 


4.12s 


38.4% 


solver: BiCGStab(£) 


14.26s 


4.28s 


69.9% 


estimator 


3.40s 


1.71s 


49.7% 


overall with UMFPACK 


16.07s 


11.45s 


28.7% 


overaU with BiCGStab(£) 


23.64s 


11.61s 


50.9% 



Table 4.1 

Comparison of runtime when using single-mesh and multi-mesh method. The values are the 
average of 7500 timesteps. 



The results are even more significant in 3D. Figure [4~3| shows the result of com- 
puting a dendrite with the multi-mesh method and the following parameters: 



A = 0.55,D = l,e = 0.05 

We have run the simulation with a constant timestep r = 1.0 up to time 2500. The 
evolution of degrees of freedom over time is quite similar to the 2D example. When 
using the multi-mesh method, the time for solving the linear system, again using 
the BiCGStab(i?) solver with diagonal preconditioning, can be reduced by around 
60%. The time for error estimation is around half the time needed by the single-mesh 
method. Because the time for assembling the linear system is more significant in 3D, 
the overall time reduction with the multi-mesh method is around 24.4%. 



4.2. Coarsening. As a second example we consider coarsening of a binary struc- 
ture using a Cahn-Hilliard equation. We here concentrate only on the phenomenolog- 
ical behaviour of the solution and thus consider the simplest isotropic model, which 

12 




Fig. 4.3. a) 3D dendrites at t = 2500 using the multi-mesh method with the parameters 
A = 0.55, D = 1, e = 0.05 and a timestep t = 1.0. b) Slice through the dendrites showing the 
mesh of the phase field on the left and the mesh of the temperature field on the right hand side. 



reads 



dtcj) = A^ 

Ai = -eA(/.+ -G'(0) 
e 



(4.5) 



for a phase- field function (f> and a chemical potential /i. The parameter e again defines 
a length scale over which the interface is smeared out, and G((/)) = 18</)^(1 — (j))'^ 
defines a double-well potential. To discretize in time we again use a semi-implicit 
Euler scheme 



1 . A 1 

-<Pn+l - A^„+i = -( 
T T 



Mn+1 + eA0„4 



1 



(4.6) 



G'i 



»n+l 



= 



in which we linearize G'(0„+i) w G"((/)„)0„+i -I- G'((/)„) - G"{(j)n)4>n- 

To compare our multi-mesh method with a standard adaptive finite element ap- 
proach, we have computed the spinodal decomposition and coarsening process using 
Lagrange finite elements of fourth order. We use e = 5 • iO~^. The adaptive mesh 
refinement relies on the residuum based a posteriori error estimate. As we have done 
it in the dendritic growth simulation, also here only the jump residuum is considered, 



i.e., the constants Gq and Gi in Equation (2.i| are set to and 1, respectively. For 



both methods, the error tolerance are set to tolc/, — 2.5 • 10^"* and tol^ = 5 • 10^^. For 
adaptivity, the equidistribution strategy with parameters 9ji — 0.8 and 9c — 0.2 was 
used. Using these parameters, the interface is resolved by around iO points. 




Fig. 4.4. Solution of the Cahn-Hilliard equation for t = 0.02, t = 1.0, t = 5.0 and t = 12.50 
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The simulation was started from noise. The first mesh was globally refined with 
196608 elements. The constant timestep was chosen to be r = 10~^. We have 
disabled the adaptivity for the first 10 timesteps, until a very first coarsening in the 
domain was achieved. Then the simulation was executed up to t = 13.0, where both 
phases are nearly separated. Figure 4.4 shows the phase field, i.e., the 0.5 contour of 
the first solution variable, for four different timesteps. The number of elements and 
degrees of freedom is linear to the area of the interface that must be resolved on the 
domain. Indeed, the chemical potential can be resolved on a mush coarser grid, since 
it is independent of the resolution of the phase field. In the final state, the chemical 
potential is constant on the whole domain, and the macro mesh (which consists of 
6 elements in this simulation) is enough to resolve it. The evolution of the number 
of elements for both variable over time is plotted in Figure |4.2| As expected, the 
number of elements for the phase field monotonously decreases as its area shrinks due 
to the coarsening process. The number of elements for the chemical potential rapidly 
decreases at the very first beginning, as the initial mesh is over refined to resolve this 
variable. For most of the simulation, the number of elements of the chemical potential 
is three magnitudes smaller the the number of elements for the phase field variable. 
This gap is also reflected in the computation time for the single-mesh and the multi- 
mesh method. AMDiS has computed the solution with the multi-mesh method in 
722 minutes. When using the standard single-mesh method, where the number of 
elements for the only mesh is nearly equivalent to the number of elements needed 
in the phase field mesh when using the multi-mesh approach, the simulation time 
was 1465 minutes. For both simulations, we have used the BiCGStab(£) solver with 
diagonal preconditioning. 
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Fig. 4.5. Evolution of number of elements for both variables of the Cahn-Hilliard equation. 



4.3. Fluid dynamics. For the last example, we use our multi-mesh method 
to solve problems in fluid dynamics using standard linear finite elements. The inf- 
sub stability is established by using two different meshes. In 2D, the mesh for the 
velocity components is refined twice more than the mesh for pressure. In the 3D case, 
the velocity mesh has to be refined three times to get the corresponding refinement 
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structure. This discretization was introduced by Bercovier and Pironncau [27 , and 
was analyzed and proven to be stable by Verfiirth |28| . To show this technique, we 
solve the standard instationary Navier-Stokes equation given by 



dtu - i/V^u + u-\/u + \/p = f 



(4.7) 



where i/ > is the kinematic viscosity. The time is discretized by the standard 



backward Euler method. The nonlinear term in (4.7 1 is linearized by 



Vii,, 



(4.8) 



The model problem is the "driven cavity" flow, as described and analyzed in 
[^ 150] . In a unit square, the boundary conditions for the velocity are set to be 
zero on the left, right and lower part of the domain. On the top, the velocity into 
X direction is set to be one and into y direction to be zero. In the upper corners, 
both velocities are set to be zero, which models the so called non-leaky boundary 
conditions. The computation was done for several Reynold numbers varying between 
50 and 1000. First, we have used the single-mesh method with a standard Taylor- 
Hood element, i.e., second order Lagrange finite element for the velocity components a 
linear Lagrange finite element for the pressure. Than, we have compared these results 
with our multi-mesh method, where for both components a linear finite elements was 
used and, instead, the mesh for velocity discretization is refine twice more than the 
pressure. All computations were done with a fixed timestep t = 0.01 and aborted, 
when the relative change in velocity and pressure is less than 10~^. Figure 4.3 shows 
the results for Re = 1000. In Table [42} we give the position of all eddies and compare 
our results with the work of Ghia et al. j30j and Wall i29j . 



single mesh 



multi mesh 




Fig. 4.6. Results for Re = 1000. The left column shows 30 pressure isolines in the range 
-0.12,0.12. The right column shows the streamlines contours of the velocity. 
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Eddy 1 


Eddy 2 


Eddy 3 


Eddy 4 


single-mesh 


0.5310, 
0.5658 


0.8633, 
0.1116 


0.0838, 
0.0775 


0.9937, 
0.0062 


multi-mesh 


0.5305, 
0.5671 


0.8669, 
0.1125 


0.0813, 
0.0750 


0.9953, 
0.0062 


Wall 


0.5308, 
0.5660 


0.8643, 
0.1115 


0.0832, 
0.0775 


0.9941, 
0.0066 


Ghia et al. 


0.5313, 
0.5625 


0.8594, 
0.1094 


0.0859, 
0.0781 


0.9922, 
0.0078 



Table 4.2 
Comparison of eddy position for Re = 1000 in the driven cavity model. 



In both, the single-mesh method and the multi-mesh method, all finite element 
spaces have the same number of unknowns. This is the reason, why the usage of 
the latter one is not faster in contrast to the single-mesh method, as it is the case 
in dendritic growth. The time for assembling the linear system growth from 4.13 
seconds to 5.79 seconds, which is mainly caused by the multiplication of the element 
matrices with the transformation matrices. Instead, the average solution time with 
a BiCGStab(£) solver and ILU preconditioning decreases from 10.18 seconds to 8.88 
seconds. Although the linear systems have the same number of unknowns, the linear 
systems resulting from the single-mesh method are denser due to the usage of second 
order finite elements. The number of non-zero entries decreases around 20% when 
linear elements are used on both meshes. 

5. Conclusion. To further improve efficiency of adaptive finite element simula- 
tions we consider the usage of different adaptively refined meshes for different variables 
in systems of nonlinear, time-depended PDEs. The different variable can have very 
distinct solution behaviour. To resolve this the meshes can be independently adapted 
for each variable. Our multi-mesh method works for Lagrange finite elements of arbi- 
trary degree and is independent of the spatial dimension. The approach is well defined, 
and can be implemented in existing adaptive finite element codes with minimal ef- 
fort. We have demonstrated for various examples that the resulting linear systems are 
usually much smaller, when compared to the usage of a single mesh, and the overall 
computational runtime can be more than halved in various cases. Phase transition 
problems within a diffuse interface approach are well suited for our approach. The 
same holds for saddle-point problems in which the inf-sup condition can be fullfilled 
for finite elements of the same order. 

Further examples which are currently under investigation include general diffuse 
interface concepts to solve PDEs in complex domain. Here a phase-field function is 
used to describe the domain implicitly |17j , which only requires a fine resolution along 
the boundaries. The approach might also be used in time stepping schemes to prevent 
loss of information during coarsening. In a classical approach the solution from the 
old time step is simply interpolated to the new mesh at the new time step. If the 
new mesh is coarser information is lost, which can be prevented by using the multi- 
mesh approach for the solution at different time steps. And also in optimal control 
problems the approach is very promizing, as in many situations the dual solution 
is much smoother than the primal solution and thus can be discretized on a much 
coarser mesh using are multi-mesh approach which will be demonstrated for control 
of an AUen-Cahn equation. 
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